1. An apparatus for processing data comprising: 

a plurality of individual processing elements arranged in a serial array wherein a 
first processing element precedes a second processing element which precedes an nth 
processing element; and, 

a clock distribution circuit in electrical communication with each processing 
element of the plurality of individual processing elements in the serial array such that, in 
use, a clock signal propagated along the clock distribution circuit arrives at each 
processing element delayed relative to the clock signal arriving at a preceding processing 
element; 

wherein a time equal to an exact number of clock cycles, k, where k is 
greater than zero, from when the data is clocked into a processing element to when the 
data is clocked in by a subsequent processing element is insufficient for providing 
accurate output data from the processing element but wherein the same time with the 
additional delay is sufficient and wherein new data to be processed is clocked in by the 
same processing element after the exact number of clock cycles, k. 
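
As an illustrative aside (not part of the claim language), the timing condition recited in claim 1 can be sketched in Python. The function name and the parameters `period`, `k`, `skew` and `process_time` are assumptions introduced here, and the sketch treats the additional delay as a single per-stage value expressed in the same time unit as the clock period.

```python
def forward_timing_ok(period, k, skew, process_time):
    """Check the claim 1 condition for one processing element.

    period       -- clock period (hypothetical unit, e.g. ns)
    k            -- exact number of clock cycles between clocking events (k > 0)
    skew         -- additional clock delay seen by the subsequent element
    process_time -- time the element needs to produce accurate output
    """
    if k <= 0:
        raise ValueError("k must be greater than zero")
    window = k * period
    # k cycles alone are insufficient, but k cycles plus the delay suffice.
    return window < process_time <= window + skew


# New data is still clocked into the same element every k * period.
print(forward_timing_ok(period=10, k=1, skew=3, process_time=12))
```

With these assumed numbers, one 10-unit cycle is too short for a 12-unit operation, while the 3-unit delay extends the usable window to 13 units.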

2. The apparatus according to claim 1, the serial array having a first path in a first 
direction and a second path in a second other direction, the second path at each stage 
having a process time shorter than the process time of the first path at each stage. 

3. The apparatus according to claim 2 wherein the clock signal is distributed 
independently to each processing element. 

4. The apparatus according to claim 3 wherein the delay between any two adjacent 
processing elements is approximately a same delay. 

5. The apparatus according to claim 4 wherein the direction of propagation of the 
clock signal is switchable. 

6. The apparatus according to claim 4 wherein the exact number of clock cycles, k, 
is one clock cycle. 

7. The apparatus according to claim 2 wherein the clock signal is gated from a 
preceding processing element to a next processing element. 

8. The apparatus according to claim 7 wherein the direction of propagation of the 
clock signal is switchable. 

9. The apparatus according to claim 2 wherein at least a processing element of the 
serial array is time-synchronized to an external circuit. 

10. The apparatus according to claim 9 wherein the external circuit includes a 
memory buffer. 

11. The apparatus according to claim 10 wherein the external circuit includes an 
input/output port for receiving data from an external data source and for providing said 
data to the memory buffer. 

12. The apparatus according to claim 11 wherein the serial array comprises: 

a first pipeline array having a first predetermined number of processing elements, n; and, 

a second different pipeline array having a second predetermined number of 
processing elements, m. 

13. The apparatus according to claim 12 wherein at least a processing element of the 
first pipeline array is in electrical communication with the memory buffer via a hardware 
connection, the at least a processing element of the first pipeline array being time- 
synchronized to the memory buffer for retrieving data therefrom. 

14. The apparatus according to claim 13 wherein the at least a processing element of 
the first pipeline array is a first processing element of the first pipeline array. 

15. The apparatus according to claim 13 wherein the nth element of the first pipeline 
array and the mth element of the second pipeline array are in electrical communication 
via a hardware connection, such that data having been provided to the first processing 
element of the first pipeline array and propagated to the nth processing element thereof is 
further propagated to the mth processing element of the second pipeline array for 
additional processing therein. 

16. The apparatus according to claim 15 wherein the first predetermined number of 
processing elements, n, and the second predetermined number of processing elements, m, 
are a same predetermined number of processing elements and wherein, in use, the delay 
to the nth element and to the mth element is approximately equal such that a tail-to-head 
data transfer between the nth element of the first pipeline array and the mth element of 
the second pipeline array is substantially time-synchronized. 

17. The apparatus according to claim 13 wherein at least a processing element of the 
second pipeline array is in electrical communication with the memory buffer via a second 
hardware connection, the at least a processing element of the second pipeline array being 
time-synchronized to the memory buffer for retrieving data therefrom. 

18. The apparatus according to claim 17 wherein the at least a processing element of 
the second pipeline array is a first processing element of the second pipeline array. 

19. The apparatus according to claim 17 wherein the nth element of the first pipeline 
array and the mth element of the second pipeline array are in electrical communication 
via a hardware connection, such that data having been provided to the first processing 
element of the first pipeline array and propagated to the nth processing element thereof is 
further propagated to the mth processing element of the second pipeline array for 
additional processing therein. 

20. The apparatus according to claim 17 comprising a third pipeline array having a 
third predetermined number of processing elements, q. 

21. The apparatus according to claim 20 wherein at least a processing element of the 
third pipeline array is in electrical communication with the memory buffer via a third 
hardware connection, the at least a processing element of the second pipeline array being 
time-synchronized to the memory buffer for retrieving data therefrom. 

22. The apparatus according to claim 21 wherein the at least a processing element of 
the third pipeline array is a first processing element of the third pipeline array. 

23. The apparatus according to claim 21 wherein the nth element of the first pipeline 
array and the mth element of the second pipeline array are in electrical communication 
via a first hardware connection, and the first element of the second pipeline array and the 
first element of the third array are in electrical communication via a second hardware 
connection, such that a tail-to-head data transfer between the nth element of the first 
pipeline array and the mth element of the second pipeline array is substantially time- 
synchronized and such that a head-to-tail data transfer between the first element of the 
second pipeline array and the first element of the third pipeline array is substantially 
time-synchronized. 

24. The apparatus according to claim 12 comprising a third pipeline array having a 
third predetermined number of processing elements, q. 

25. The apparatus according to claim 24 wherein the nth element of the first pipeline 
array and the mth element of the second pipeline array are in electrical communication 
via a first hardware connection, and the first element of the second pipeline array and the 
first element of the third array are in electrical communication via a second hardware 
connection. 

26. A switchable processing element comprising: 
a first port for receiving a first clock signal; 

a second port for receiving a second other clock signal; 

a switch operable between two modes for selecting one of the first clock signal 
and the second other clock signal; and 

wherein the selected one of the first clock signal and the second other clock signal 
is provided to the processing element. 

27. A method for processing data comprising the steps of: 

(a) providing a pipeline processor including a plurality of individual 
processing elements arranged in a serial array such that a first processing element 
precedes a second processing element which precedes an nth processing element; 

(b) providing a clock signal to each processing element of the plurality of 
individual processing elements in the serial array such that the clock signal arrives at each 
individual processing element beyond the first processing element delayed relative to the 
clock signal arriving at a preceding processing element; 

(c) providing data to the first processing element for processing therein; and, 

(d) propagating the data to at least a next processing element for additional 
processing therein, 

wherein the clock signal provided to an element in the plurality of 
individual processing elements is delayed relative to the clock signal provided to another 
element of the plurality of individual processing elements by a substantial amount 
relative to the clock period. 

28. A method according to claim 27 wherein a time equal to an exact number of clock 
cycles, n, where n>0, from when the data is provided to the first processing element to 
when the data is propagated to the at least a next processing element is insufficient for 
providing accurate output data from the first processing element but wherein the same 
time with the additional delay is sufficient and wherein new data to be processed is 
provided to the first processing element after the exact number of clock cycles, n. 

29. The method according to claim 27 wherein the at least a next processing element 
propagates data in a second other processing direction away from the first processing 
element for additional processing therein. 

30. The method according to claim 29 wherein the step of providing data comprises 
the steps of: 

synchronizing the first processing element to an external circuit, the external 
circuit for receiving the data for processing by the first processing element from an 
external source; and, 

reading the data for processing by the first processing element from the external 
circuit. 

31. The method according to claim 30 wherein the external circuit is a memory buffer 
for receiving the data for processing by the first processing element. 

32. The method according to claim 29 wherein one of the first and second direction 
requires a shorter processing time relative to the other. 

33. The method according to claim 32 wherein the clock signal is distributed 
independently to each processing element. 

34. The method according to claim 33 wherein the exact number of clock cycles, k, is 
one clock cycle. 

35. The method according to claim 33 wherein the delay between any two adjacent 
elements is approximately a same delay. 

36. The method according to claim 33 wherein the delay plus the exact number of 
clock cycles is a longer period of time than the processing time in the direction of delay. 

37. The method according to claim 36 wherein the exact number of clock cycles 
minus the delay is a longer period of time than the processing time in the direction other 
than the direction of delay but a shorter period of time than the processing time in the 
direction of the delay. 

38. The method according to claim 37 wherein the clock cycle is at least an average 
of the processing times in each direction. 
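
As an illustrative aside (not part of the claim language), the inequalities recited in claims 36 to 38 can be checked numerically. The sketch below is a minimal Python model under stated assumptions: the function name, the parameter names and the sample values are hypothetical, `t_forward` stands for the processing time in the direction of delay, and `t_reverse` for the processing time in the other direction.

```python
def check_bidirectional_timing(period, k, delay, t_forward, t_reverse):
    """Evaluate the timing relations of claims 36 to 38 (illustrative model).

    period    -- clock period
    k         -- exact number of clock cycles (k > 0)
    delay     -- clock delay between adjacent processing elements
    t_forward -- processing time in the direction of delay
    t_reverse -- processing time in the direction other than that of delay
    """
    window = k * period
    return {
        # Claim 36: delay plus k cycles exceeds the forward processing time.
        "claim_36": window + delay > t_forward,
        # Claim 37: k cycles minus delay exceeds the reverse processing time
        # but is shorter than the forward processing time.
        "claim_37": t_reverse < window - delay < t_forward,
        # Claim 38: the clock cycle is at least the average processing time.
        "claim_38": period >= (t_forward + t_reverse) / 2,
    }


result = check_bidirectional_timing(period=10, k=1, delay=3,
                                    t_forward=12, t_reverse=6)
print(result)
```

For k equal to one, adding the claim 36 relation to the first claim 37 relation gives twice the clock period exceeding the sum of the two processing times, which is consistent with the averaging relation of claim 38.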

39. The method according to claim 32 wherein the clock signal is gated from a 
preceding processing element to a next processing element, each processing element 
having therein circuitry for causing a known delay in the clock signal. 

40. The method according to claim 32 wherein the data is provided for encryption to 
the pipeline processor. 

41. A method for processing data within a pipeline processor comprising the steps of: 

(a) providing a clock signal in a first direction along a first portion of the 
pipeline processor having a number, n, processing elements such that the clock signal 
arrives at each individual processing element beyond the first processing element of the 
first portion delayed relative to the clock signal arriving at a preceding processing 
element of the same first portion; 

(b) providing a clock signal in a second substantially opposite direction along 
a second other portion of the pipeline processor having a same number, n, processing 
elements such that the clock signal arrives at each individual processing element beyond 
the first processing element of the second other portion delayed relative to the clock 
signal arriving at a preceding processing element of the same second other portion; 

(c) providing data to the first processing element of the first portion of the 
pipeline processor for processing therein; 

wherein the delay to the last processing element of the first portion is an 
approximately same delay as the delay to the last processing element of the second 
portion, such that at center of the pipeline processor the two adjacent processing elements 
are in synchronization. 
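
As an illustrative aside (not part of the claim language), the clocking arrangement of claim 41 can be sketched as a list of clock arrival times. The sketch assumes a uniform delay between adjacent processing elements and a physical layout in which the first portion occupies positions 0 to n-1 and the second portion occupies positions n to 2n-1, with its clock entering at the far end; the function name and both assumptions are introduced here for illustration only.

```python
def clock_arrival_times(n, delay):
    """Return clock arrival times by physical position for two portions.

    The first portion is clocked in a first direction (position 0 first).
    The second portion is clocked in the opposite direction, so its clock
    enters at position 2n-1 and reaches position n last.
    """
    first_portion = [i * delay for i in range(n)]
    second_portion = [(n - 1 - i) * delay for i in range(n)]
    return first_portion + second_portion


arrivals = clock_arrival_times(n=4, delay=2)
print(arrivals)  # [0, 2, 4, 6, 6, 4, 2, 0]
# The two adjacent elements at the center see the same arrival time.
print(arrivals[3] == arrivals[4])  # True
```

Under these assumptions the last element of each portion sees the same accumulated delay, so the two elements that meet at the center of the pipeline are clocked together.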

42. The method according to claim 41 wherein the data is provided for encryption by 
the pipeline processor. 

43. A macro for use in layout of an apparatus for processing data comprising: 

a plurality of individual processing elements arranged serially and having a clock 
input conductor and a clock output conductor, the clock input conductor in 
communication with a clock conductor having increased length from the clock input 
conductor to each subsequent element within the plurality of individual 
processing elements and wherein the clock conductor has decreased length from the clock 
output conductor to each subsequent element within the plurality of individual 
processing elements, 

wherein the clock input conductor and output conductor are arranged such 
that adjacently placed macros form space efficient blocks within a layout and such that 
the input clock conductor of one macro and the output clock conductor of an adjacent macro 
when coupled have approximately a same conductor path length as the conductor path 
length between adjacent elements within a same macro when the macros are disposed in a 
predetermined space efficient placement. 